To conduct an accessibility analysis, we need three inputs: 1) OSM road data, 2) a shapefile of the study area, and 3) a GTFS feed.
We use these inputs to run queries against OpenTripPlanner.
library(osmdata)
library(tidyverse)
# opq used to build an overpass query
# we define the geographic extent of the area we want to query from OSM. I have added the extent of the GCR as a vector
# add_osm_feature() is used to specify what we are querying. It takes key-value pairs, but here I want all types of highways so I specify the key only
# For a list of specific values to query, see https://wiki.openstreetmap.org/wiki/Key:highway
query <- opq(bbox = c(30.801369, 29.705080, 31.874483, 30.390664)) %>%
add_osm_feature(key = 'highway')
# here we want the data in xml format. Ideally we would use pbf data, as it is more compact and makes the analysis faster, but osmdata_pbf() returns empty files. Issue discussed here https://github.com/ropensci/osmdata/issues/74
osmdata_xml(query, filename = 'greater-cairo.osm')
# the xml file is downloaded to the project working directory
Plot to see if the roads have been downloaded. This takes around 10 minutes and can be skipped
# save query as sp then plot to see if we have the correct area
highways_greaterCairo <- osmdata_sp(query)
sp::plot(highways_greaterCairo$osm_lines)
An alternative to xml is pbf files. These are more compact and presumably make the forthcoming analysis faster. They can be downloaded from the HOT OSM export tool https://export.hotosm.org/en/v3/exports/new/formats
I tried to use a pbf file but the analysis was not quicker, so I stuck to the xml file extracted from R.
In this step I import a shapefile of Greater Cairo. The shapefile has the region divided into hexagons with the population and number of jobs in each hexagon
library(sf)
Linking to GEOS 3.7.2, GDAL 2.4.2, PROJ 5.2.0
# import the shapefile
cairo_hexagons <- st_read("Cairo Shapefiles/H3-res8/H3res-8_GCR_4326.shp")
Reading layer `H3res-8_GCR_4326' from data source `/Users/Hussein/Documents/Code Repos/GIS-Coursework/Cairo Shapefiles/H3-res8/H3res-8_GCR_4326.shp' using driver `ESRI Shapefile'
Simple feature collection with 1613 features and 22 fields
geometry type: POLYGON
dimension: XY
bbox: xmin: 30.8684 ymin: 29.71863 xmax: 31.83956 ymax: 30.35436
epsg (SRID): 4326
proj4string: +proj=longlat +datum=WGS84 +no_defs
It is imported and in the correct CRS, EPSG:4326.
Let’s plot to check if it looks right
library(tmap)
# plot
tm_shape(cairo_hexagons) +
tm_polygons()
It looks right, but the area in the top right needs to be removed. This is the 10th of Ramadan City, which is not part of the GCR.
# Remove 10th of Ramadan (it is in Sharqeya)
# dplyr::filter with !grepl. This removes all rows that contain '10th of Ramadan' OR '10 Of Rammadan' in the nam_tfc column
cairo_hexagons <- cairo_hexagons %>% filter(!grepl('10th of Ramadan|10 Of Rammadan', nam_tfc))
# plot to check that it worked
tm_shape(cairo_hexagons) +
tm_polygons()
The next step is to get the centroid of each hexagon, because OpenTripPlanner takes queries as lat/lon coordinates.
# get centroids in a separate feature
h3_centroids <- st_centroid(cairo_hexagons)
Warning: st_centroid assumes attributes are constant over geometries of x
Warning: st_centroid does not give correct centroids for longitude/latitude data
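The second warning appears because st_centroid is applied to unprojected lon/lat coordinates. One way to avoid it (not used in this workflow, where the small hexagon size makes the error negligible) is to project to a metric CRS first, take the centroid, then transform back. A minimal sketch, assuming sf is installed and using UTM zone 36N (EPSG:32636), which covers Greater Cairo; the polygon coordinates are illustrative only:

```r
library(sf)

# a small illustrative rectangle near Cairo, in EPSG:4326
poly <- st_sfc(st_polygon(list(rbind(
  c(31.20, 30.00), c(31.25, 30.00),
  c(31.25, 30.05), c(31.20, 30.05),
  c(31.20, 30.00)))), crs = 4326)

# project to a metric CRS, compute the centroid there, transform back
centroid_wgs84 <- st_transform(st_centroid(st_transform(poly, 32636)), 4326)
st_coordinates(centroid_wgs84) # close to the lon/lat midpoint (31.225, 30.025)
```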
# centroids are provided as geometry but we need the lat and lon in two separate columns
# I use this function to split c(lat, lon) into two separate columns
# FROM JM London (https://github.com/r-spatial/sf/issues/231)
# lat = Y lon = X
sfc_as_cols <- function(x, names = c("lon", "lat")) {
  stopifnot(inherits(x, "sf") && inherits(sf::st_geometry(x), "sfc_POINT"))
  ret <- sf::st_coordinates(x)
  ret <- tibble::as_tibble(ret)
  stopifnot(length(names) == ncol(ret))
  x <- x[, !names(x) %in% names]
  ret <- setNames(ret, names)
  dplyr::bind_cols(x, ret)
}
# add lon and lat columns to dataframe using sfc_as_cols function
h3_centroids <- sfc_as_cols(h3_centroids)
# two additional columns (lat and lon) have been added to the sfc
h3_centroids
Simple feature collection with 1507 features and 24 fields
geometry type: POINT
dimension: XY
bbox: xmin: 30.87314 ymin: 29.72468 xmax: 31.76349 ymax: 30.28386
epsg (SRID): 4326
proj4string: +proj=longlat +datum=WGS84 +no_defs
First 10 features:
index city_nm prnt_idx W3W_id H3_arkm znng_t_
1 883f5b3601fffff 6th of October 873f5b360ffffff towels.dissolve.puppets 1175299 Outer
2 883f5b3603fffff 6th of October 873f5b360ffffff freshest.burden.enacted 1175322 Outer
3 883f5b3605fffff 6th of October 873f5b360ffffff repaid.glaze.dislodge 1175197 Outer
4 883f5b3607fffff 6th of October 873f5b360ffffff ocean.reason.mermaids 1175220 Outer
5 883f5b3609fffff 6th of October 873f5b360ffffff intrigues.garden.tent 1175379 Outer
6 883f5b360bfffff 6th of October 873f5b360ffffff builders.irritate.hockey 1175402 Outer
7 883f5b3611fffff 6th of October 873f5b361ffffff handover.mainly.bends 1175448 Outer
8 883f5b3613fffff 6th of October 873f5b361ffffff picturing.trip.underway 1175470 Outer
9 883f5b3615fffff 6th of October 873f5b361ffffff whirlwind.huddle.encoder 1175345 Outer
10 883f5b3617fffff 6th of October 873f5b361ffffff contained.hobbit.sandpaper 1175368 Outer
nam_tfc Uniq_ID SSc_N_1 n n_bank n_employ n_commer n_health n_recrea n_pub CBD Hus_ratio
1 Industrial Zone 3 89 <NA> 6 0 6 0 0 0 0 <NA> 0.6
2 Industrial Zone 3 89 <NA> 6 0 6 0 0 0 0 <NA> 0.6
3 Industrial Zone 3 Extenstion 550 <NA> 0 NA NA NA NA NA NA <NA> NA
4 Industrial Zone 3 Extenstion 550 <NA> 1 0 1 0 0 0 0 <NA> 0.1
5 Industrial Zone 3 89 <NA> 2 0 2 0 0 0 0 <NA> 0.1
6 Industrial Zone 2 285 <NA> 7 1 6 0 0 0 0 <NA> 0.6
7 Industrial Zone 1 544 <NA> 2 0 2 0 0 0 0 <NA> 0.2
8 Industrial Zone 4 475 <NA> 19 18 0 0 0 0 1 <NA> 7.8
9 Industrial Zone 2 285 <NA> 1 1 0 0 0 0 0 <NA> 0.1
10 Industrial Zone 2 285 <NA> 0 NA NA NA NA NA NA <NA> NA
pop2018cap jobsLFScou area popdenscap geometry lon lat
1 0 10281 1.175 0 POINT (30.90253 29.90795) 30.90253 29.90795
2 0 10281 1.175 0 POINT (30.89874 29.9175) 30.89874 29.91750
3 0 0 1.175 0 POINT (30.89731 29.89941) 30.89731 29.89941
4 0 1713 1.175 0 POINT (30.89353 29.90896) 30.89353 29.90896
5 0 1900 1.175 0 POINT (30.91154 29.90695) 30.91154 29.90695
6 0 10333 1.175 0 POINT (30.90775 29.91649) 30.90775 29.91649
7 0 5305 1.175 0 POINT (30.90017 29.93559) 30.90017 29.93559
8 0 1463 1.175 0 POINT (30.89638 29.94514) 30.89638 29.94514
9 0 52 1.175 0 POINT (30.89495 29.92705) 30.89495 29.92705
10 0 2524 1.175 0 POINT (30.89116 29.9366) 30.89116 29.93660
I use the package by Malcolm Morgan to query OTP through R https://docs.ropensci.org/opentripplanner/index.html
library(opentripplanner)
# OTP expects its data to be stored in a specific structure
# I create a new folder called Open-Trip-Planner in my wd
#dir.create() creates a subfolder called "OTP-Cairo-All"
path_data <- file.path("Open-Trip-Planner", "OTP-Cairo-All")
dir.create(path_data)
Now we need to download OTP and save it to the "OTP-Cairo-All" subfolder.
# otp_dl_jar function will download the OTP and save it in the folder we created
# The function returns the path to the OTP jar file.
path_otp <- otp_dl_jar(path_data)
The OTP will be saved to Open-Trip-Planner/OTP-Cairo-All/otp.jar
trying URL 'https://repo1.maven.org/maven2/org/opentripplanner/otp/1.4.0/otp-1.4.0-shaded.jar'
Content type 'application/java-archive' length 61838786 bytes (59.0 MB)
==================================================
downloaded 59.0 MB
The next step is to add the GTFS and OSM files to the created folder. Create a subfolder in OTP-Cairo-All called 'graphs', then create a subfolder in 'graphs' called 'default', and add the OSM and GTFS files inside 'default'. I followed this structure:
OTP-Cairo-All            # created above
  graphs
    default              # subfolder with the name of the router
      osm.pbf            # required OSM road map - ADDED
      gtfs.zip           # optional GTFS data - ADDED
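The folder layout above can also be created from R rather than by hand. A sketch, assuming the road network and GTFS feed are already in the working directory (the GTFS filename here is an assumption, not the name used in the original project):

```r
# create graphs/default inside OTP-Cairo-All (recursive = TRUE creates both levels)
router_dir <- file.path("Open-Trip-Planner", "OTP-Cairo-All", "graphs", "default")
dir.create(router_dir, recursive = TRUE, showWarnings = FALSE)

# copy the road network and GTFS feed into the router folder
# 'greater-cairo.osm' was downloaded earlier; 'cairo-gtfs.zip' is a placeholder name
file.copy("greater-cairo.osm", file.path(router_dir, "greater-cairo.osm"))
file.copy("cairo-gtfs.zip", file.path(router_dir, "cairo-gtfs.zip"))
```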
I add a GTFS file with all public transport in the GCR (formal + informal).
This data is used to build a Graph object, which is the base for OpenTripPlanner.
# Building an OTP Graph
# This code will create a new file, Graph.obj, saved in the location defined by path_data
# the memory argument assigns more memory to graph building (to speed it up); the default is 2GB
log <- otp_build_graph(otp = path_otp, dir = path_data, memory = 6000, analyst = TRUE)
2020-01-07 23:35:17 Basic checks completed, building graph, this may take a few minutes
The graph will be saved to Open-Trip-Planner/OTP-Cairo-All
2020-01-07 23:40:19 Graph built
Now we are ready to launch OpenTripPlanner
# Launch OTP and load the graph
otp_setup(otp = path_otp, dir = path_data)
# connect r to otp
otpcon <- otp_connect()
Now that OTP is launched and connected to R, we can start querying.
# LOOP TO GET THE REACH ISOCHRONE OF EACH HEXAGON CENTROID
# variable with the number of hexagons (loop bound)
nrows <- nrow(h3_centroids)
# empty list to store output
reachPolygons<-list()
# get reach polygon for each centroid and store in reachPolygons list
for (i in 1:nrows) {
  reachPolygons[[i]] <- otp_isochrone(otpcon = otpcon,
                                      fromPlace = c(h3_centroids$lon[i], h3_centroids$lat[i]),
                                      mode = c("WALK", "TRANSIT"),
                                      maxWalkDistance = 1000,
                                      date_time = as.POSIXct(strptime("2019-08-05 09:00", "%Y-%m-%d %H:%M")),
                                      cutoffSec = 3600) # cut-offs in seconds
}
Let's plot one of the isochrones to see if this worked.
tmap_mode("view")
tmap mode set to interactive viewing
tm_shape(reachPolygons[[568]]) +
tm_fill(col = "antiquewhite2") +
tm_borders()
The shape reachPolygons[[568]] is invalid. See sf::st_is_valid
row names were found from a short variable and have been discarded
Now that we have the isochrones, we need to calculate how many jobs each isochrone intersects.
For each intersected hexagon, we compute: (area of intersection / total area of hexagon) * jobs in the hexagon.
We then sum the results to get the number of jobs accessible from the hexagon we are querying from.
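As a worked example of this area-weighting step: a hexagon of 1.175 km² holding 1,000 jobs, of which 0.5 km² falls inside the isochrone, contributes (0.5 / 1.175) * 1000 ≈ 425.5 jobs. A sketch with toy numbers (not values from the dataset):

```r
# toy inputs: intersection areas (km^2), full hexagon areas (km^2), jobs per hexagon
int_area <- c(0.5, 1.175)   # overlap of the isochrone with each hexagon
hex_area <- c(1.175, 1.175) # full hexagon areas
jobs     <- c(1000, 200)    # jobs in each hexagon

# apportion jobs by the share of each hexagon covered, then sum
int_jobs   <- (int_area / hex_area) * jobs
total_jobs <- sum(int_jobs)
total_jobs # roughly 625.5: about 425.5 from the first hexagon plus all 200 from the second
```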
# empty list to store output
totalJobs <-list()
library(lwgeom) # for st_make_valid (this handles invalid geometries https://github.com/r-spatial/sf/issues/870)
for (i in 1:nrows) {
  totalJobs[[i]] <- 0 # some points could not be routed from by OTP; try() leaves the default 0 for these points
  try({
    totalJobs[[i]] <- reachPolygons[[i]] %>%
      st_make_valid() %>% # some geometries are invalid (this prevents an error)
      st_intersection(cairo_hexagons) %>% # intersect reachPolygon with cairo_hexagons
      mutate(int_area = (st_area(.) / 1000000) %>% as.numeric()) %>% # add column with the intersection area (km^2) with each hexagon
      mutate(int_jobs = (int_area / area) * jobsLFScou) %>% # add column int_jobs: jobs intersected
      summarise(total_jobs = sum(int_jobs)) # sum of jobs intersected by the reachPolygon
  })
}
Now that we have the job reach of each hexagon, we need to transfer these values from the totalJobs list back to the cairo_hexagons sf so that we can calculate accessibility scores and, eventually, plot the results.
# add a column for the number of jobs reachable within 60 minutes using all forms of public transport
for (i in 1:nrows) {
  cairo_hexagons$jobs_60_all[i] <- 0 # set default value = 0: will be given to bad geometries
  try({
    cairo_hexagons$jobs_60_all[i] <- totalJobs[[i]][[1]] # add totalJobs to a new column in cairo_hexagons called jobs_60_all
  })
}
Now let's get the accessibility score for each hexagon. This is the percentage of the total jobs in the GCR that are reachable from that hexagon.
# percentage of jobs accessible from each hexagon
cairo_hexagons$access_per_60_all = ((cairo_hexagons$jobs_60_all/ sum(cairo_hexagons$jobsLFScou))*100)
Calculate the accessibility score for the entire study area:
1- weight each hexagon's accessibility score by its population (multiply them)
2- divide the sum by the total population of the GCR
# as.numeric used to return a number instead of a matrix
# Average jobs reached using all modes
as.numeric((cairo_hexagons$jobs_60_all %*% cairo_hexagons$pop2018cap) / sum(cairo_hexagons$pop2018cap))
[1] 276254.4
# Average accessibility Using all Modes (%):
as.numeric((cairo_hexagons$access_per_60_all %*% cairo_hexagons$pop2018cap) / sum(cairo_hexagons$pop2018cap))
[1] 4.71886